Word Sense Induction with Attentive Context Clustering

Authors

Abstract

This paper presents ACCWSI (Attentive Context Clustering WSI), a method for Word Sense Induction that is suitable for languages with limited resources. Pretrained on a small corpus and given an ambiguous word (a query word) and a set of excerpts that contain it, ACCWSI uses an attention mechanism to generate context-aware embeddings that distinguish between the different senses assigned to the query word. These embeddings are then clustered to provide groups of the main common uses of the query word. We show that ACCWSI performs well on the SemEval-2 2010 WSI task. ACCWSI also demonstrates practical applicability by shedding light on the meanings of ambiguous words in ancient languages, such as Classical Hebrew and Akkadian. In the near future, we intend to turn ACCWSI into a practical tool for linguists and historians.
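The two stages named in the abstract (attention-based context embeddings, then clustering) can be illustrated with a minimal sketch. This is not the authors' implementation: the 2-D word vectors, the dot-product softmax attention, the function names, and the choice of k-means with farthest-point initialization are all assumptions made for illustration, since the abstract specifies neither the model architecture nor the clustering algorithm.

```python
import math

# Hypothetical 2-D word vectors, invented for this example only.
TOY_VECTORS = {
    "bank": [1.0, 1.0],
    "river": [0.0, 2.0], "water": [0.0, 1.8], "shore": [0.1, 1.9],
    "money": [2.0, 0.0], "loan": [1.9, 0.1], "deposit": [1.8, 0.0],
}

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def attentive_embedding(query, tokens, vectors=TOY_VECTORS):
    """Attention-weighted average of the context words around `query`."""
    q = vectors[query]
    context = [vectors[t] for t in tokens if t != query and t in vectors]
    if not context:
        return list(q)
    # Softmax over dot-product scores (assumed attention form).
    scores = [dot(q, c) for c in context]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    return [sum(w * c[i] for w, c in zip(weights, context))
            for i in range(len(q))]

def kmeans(points, k, iterations=10):
    """Plain k-means with deterministic farthest-point initialization."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        farthest = max(points,
                       key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(farthest))
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign each point to its nearest centroid.
        labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j]))
                  for p in points]
        # Recompute each centroid; keep the old one if a cluster is empty.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members)
                                for col in zip(*members)]
    return labels

# Excerpts containing the query word "bank" in two different senses.
excerpts = [
    ["river", "bank", "water"],
    ["shore", "bank", "river"],
    ["money", "bank", "loan"],
    ["deposit", "bank", "money"],
]
embeddings = [attentive_embedding("bank", e) for e in excerpts]
labels = kmeans(embeddings, k=2)
print(labels)
```

In this toy setting each cluster label stands for one induced sense of the query word. A real system would replace the hand-written vectors with representations learned during pretraining, and would need some way to choose the number of clusters.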

Related Resources

Chinese Word Sense Induction with Basic Clustering Algorithms

Word Sense Induction (WSI) is an important topic in natural language processing. For the Chinese Word Sense Induction (CWSI) bakeoff task, this paper proposes two systems using basic clustering algorithms: k-means and agglomerative clustering. Experimental results show that k-means achieves better performance. Based only on the data provided by the task organizers, the two systems get FSc...

SATTY: Word Sense Induction Application in Web Search Clustering

This paper aims to perform Word Sense Induction (WSI) by clustering web search results and producing a diversified list of search results. It describes the WSI system developed for Task 11 of SemEval 2013, which implements monotone submodular function optimization using a greedy algorithm.

KSU KDD: Word Sense Induction by Clustering in Topic Space

We describe our language-independent unsupervised word sense induction system. This system uses only topic features to cluster different word senses in their global context topic space. Using unlabeled data, the system trains a latent Dirichlet allocation (LDA) topic model, then uses it to infer the topic distributions of the test instances. By clustering these topic distributions in their top...

Word Sense Induction: Triplet-Based Clustering and Automatic Evaluation

In this paper, a novel solution to automatic and unsupervised word sense induction (WSI) is introduced. It represents an instantiation of the ‘one sense per collocation’ observation (Gale et al., 1992). Like most existing approaches, it utilizes clustering of word co-occurrences. This approach differs from other approaches to WSI in that it enhances the effect of the one sense per collocation obs...

Applying Spectral Clustering for Chinese Word Sense Induction

Sense Induction is the process of identifying the sense of a word given its context, and is often treated as a clustering task. This paper explores the use of a spectral clustering method that incorporates word features and n-gram features to determine which cluster a word belongs to; each cluster represents one sense in the given document set.

Journal

Journal title: Journal of Data Mining and Digital Humanities

Year: 2022

ISSN: 2416-5999

DOI: https://doi.org/10.46298/jdmdh.9175